Calibration of distance measures for unsupervised query-by-example

Authors

  • Michele Gubian
  • Lou Boves
  • Maarten Versteegh
Abstract

Retrieving information from the ever-increasing amount of unannotated audio and video recordings requires techniques such as unsupervised pattern discovery or query-by-example. In this paper we focus on queries that are specified in the form of an audio snippet containing the desired word or expression excised from the target recordings. The task is to retrieve all and only the instances whose match score with the query meets an absolute criterion. For this purpose we introduce a distance measure between two acoustic vectors that can be calibrated in a completely unsupervised manner. This measure also enables a fast matching approach, which makes it possible to skip more than 97% of full-fledged dynamic time warping (DTW) computations without affecting performance in terms of precision and recall. We demonstrate the effectiveness of the proposals with query-by-example experiments conducted on a read speech corpus for English and a spontaneous speech corpus for Dutch.
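
As a rough illustration of the retrieval setting described in the abstract, the sketch below scores candidate segments against a query with plain DTW and keeps those that meet an absolute threshold. It is not the authors' method: the Euclidean frame distance, the length normalisation, and the names `dtw_distance`, `retrieve`, and `threshold` are all assumptions made for this example, and neither the calibrated distance measure nor the fast matching step from the paper is reproduced here.

```python
import numpy as np


def dtw_distance(query, candidate):
    """Length-normalised DTW cost between two (frames x dims) arrays.

    Assumption: Euclidean distance between frames stands in for the
    paper's calibrated distance measure, which is not specified here.
    """
    n, m = len(query), len(candidate)
    # Pairwise frame-to-frame distances, shape (n, m).
    local = np.linalg.norm(query[:, None, :] - candidate[None, :, :], axis=2)
    # Accumulated cost matrix with an extra border row and column.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    # Normalise by n + m (an assumption) so scores are comparable across lengths.
    return acc[n, m] / (n + m)


def retrieve(query, candidates, threshold):
    """Indices of candidates whose DTW score is at or below a fixed threshold."""
    return [
        k for k, cand in enumerate(candidates)
        if dtw_distance(query, cand) <= threshold
    ]
```

With a fixed `threshold`, every candidate is accepted or rejected on its own score rather than by rank, which is the sense in which the match criterion is absolute.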


Similar resources

Supervised and Unsupervised Sound Retrieval by Vocal Imitation

Searching sounds with text labels is often problematic and time consuming as text labels do not often describe the detailed audio content. Query by example is a way to improve the effectiveness and efficiency of sound retrieval. In this paper we propose a novel approach for sound query by example: query by vocal imitation. Vocal imitation is commonly used in human communication and can be emplo...


New distance and similarity measures for hesitant fuzzy soft sets

The hesitant fuzzy soft set (HFSS), as a combination of hesitant fuzzy and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered as the important tools in different areas such as ...


Unsupervised confidence calibration using examples of recognized words and their contexts

This paper presents a novel unsupervised calibration framework of word confidence measures for automatic speech recognition. It makes it possible to improve the quality of confidence measures in situations where the training of parametric models is hindered by a lack of human-labeled in-domain data. The proposed method calibrates confidence scores by utilizing recognition results stored in depl...


Semi-Supervised Information Retrieval System for Clinical Decision Support

This article summarizes the approach developed for TREC 2016 Clinical Decision Support Track. In order to address the daunting challenge of retrieval of biomedical articles for answering clinical questions, an information retrieval methodology was developed that combines pseudo-relevance feedback, semantic query expansion and document similarity measures based on unsupervised word embeddings. T...


Optimization of sediment rating curve coefficients using evolutionary algorithms and unsupervised artificial neural network

The sediment rating curve (SRC) is a conventional and common regression model for estimating the suspended sediment load (SSL) from flow discharge. However, in most cases the log-transformation of the data in SRC models causes a bias that leads to underestimated SSL predictions. In this study, using the daily stream flow and suspended sediment load data from Shalman hydrometric station on Shalmanroud River, Guilan Pro...



Publication date: 2013